
Feature: Added Support accepting OTLP via Kafka #4049

Closed
wants to merge 2 commits

Conversation

vishal-chdhry

Signed-off-by: Vishal Choudhary [email protected]

Adds two more recognized encodings, otlp-json and otlp-proto.

Which problem is this PR solving?

Resolves #3949

Signed-off-by: Vishal Choudhary <[email protected]>
@vishal-chdhry
Author

@yurishkuro the decoders I managed to implement return ptrace.Traces, but the expected return type is *model.Span.
Can you please guide me on how to fix this?

@yurishkuro
Member

See

protoFromTraces: otlp2jaeger.ProtoFromTraces,
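
(For reference, a minimal sketch of how such an unmarshaler could chain the pdata decoder with that translator to fit the existing one-span-per-message interface. The package placement, type name, and the naive batches[0].Spans[0] indexing are illustrative assumptions rather than this PR's final code; that indexing is exactly what the review below flags.)

package kafka // illustrative placement only

import (
	"errors"

	"github.com/jaegertracing/jaeger/model"
	otlp2jaeger "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/jaeger"
	"go.opentelemetry.io/collector/pdata/ptrace"
)

// otlpProtoUnmarshaler decodes an OTLP/protobuf payload and converts it to
// the Jaeger model type expected by the existing Kafka consumer.
type otlpProtoUnmarshaler struct {
	unmarshaler ptrace.ProtoUnmarshaler
}

func (h otlpProtoUnmarshaler) Unmarshal(msg []byte) (*model.Span, error) {
	traces, err := h.unmarshaler.UnmarshalTraces(msg)
	if err != nil {
		return nil, err
	}
	// Convert OTLP traces into Jaeger model batches.
	batches, err := otlp2jaeger.ProtoFromTraces(traces)
	if err != nil {
		return nil, err
	}
	if len(batches) == 0 || len(batches[0].Spans) == 0 {
		return nil, errors.New("no spans found in OTLP message")
	}
	// One span per Kafka message is assumed; taking only the first span
	// silently drops any others, which is the concern raised below.
	return batches[0].Spans[0], nil
}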

@vishal-chdhry changed the title from "[WIP] Added Support accepting OTLP via Kafka" to "Feature: Added Support accepting OTLP via Kafka" on Nov 16, 2022
@vishal-chdhry
Author

@yurishkuro Done! Can you please review it?

// EncodingOtlpJSON is used for spans encoded as OTLP JSON.
EncodingOtlpJSON = "otlp-json"
// EncodingOtlpProto is used for spans encoded as OTLP Proto.
EncodingOtlpProto = "otlp-proto"
Member

Add these to L69 as well, so that they appear in the -h output.
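
(The L69 reference points at the list of recognized encodings that feeds the encoding flag's help text. A hedged sketch of the kind of addition meant here; the AllEncodings variable name is assumed for illustration and may not match the actual code:)

// Hypothetical shape of the list behind the encoding flag's -h output.
AllEncodings = []string{
	EncodingJSON,
	EncodingProto,
	EncodingZipkinThrift,
	EncodingOtlpJSON,  // newly added
	EncodingOtlpProto, // newly added
}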

if err != nil {
	return nil, err
}
return batch[0].Spans[0], nil
Member

This looks like a problem. Does the OTLP Kafka exporter allow writing batches of spans as a single Kafka message?

Author

Can you please help me fix this?

Member

It may be difficult to "fix" if by that you mean implementing support for receiving batches. The current consumer in Jaeger was designed to receive one span per message.

I suggest looking into whether OTEL Collector can be configured to send one span per message (Kafka exporter introduced in open-telemetry/opentelemetry-collector#1439), probably with some pipeline configuration. We would want to reference that in Jaeger docs to make it clear that batch per message is not supported. And we should log an error in the code above if more than one span is found in the batch.
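
(A minimal sketch of the guard being suggested, continuing the unmarshaler fragment above; the h.logger field and the message wording are assumptions:)

batches, err := otlp2jaeger.ProtoFromTraces(traces)
if err != nil {
	return nil, err
}
if len(batches) == 0 || len(batches[0].Spans) == 0 {
	return nil, errors.New("no spans found in OTLP message")
}
// One span per message is the supported shape; anything more means the
// producer did not de-batch, so report it rather than dropping spans silently.
if len(batches) > 1 || len(batches[0].Spans) > 1 {
	h.logger.Error("OTLP message contains more than one span; only the first span is ingested")
}
return batches[0].Spans[0], nil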

Author

@yurishkuro I was busy with my college exams; I will get on it right away.

Member

I've looked into this a bit. I don't think OTEL has an option right now to split a batch into multiple Kafka messages, and I suggest that's what needs to happen. While it may be less efficient, going the other way (i.e. supporting batches in Jaeger) would break an invariant that we currently maintain that all spans from a given trace ID end up in the same Kafka partition. The current OTEL collector code won't be able to maintain that invariant.
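
(For context on that invariant: the Jaeger Kafka producer keys each message by trace ID, so Kafka's default partitioner sends all spans of a trace to the same partition. A simplified sketch, not a verbatim copy of the writer:)

// Keying by trace ID is what keeps all spans of a trace in one partition.
msg := &sarama.ProducerMessage{
	Topic: w.topic,
	Key:   sarama.StringEncoder(span.TraceID.String()),
	Value: sarama.ByteEncoder(spanBytes),
}
w.producer.Input() <- msg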

Author

So can you please help me understand what I have to do?

Member

We need to make a change to OTEL Kafka exporter to support a config flag that would force de-batching of the spans into one span per message.
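
(To illustrate what such de-batching would involve, a sketch of a helper that splits an OTLP payload into single-span ptrace.Traces values, one per future Kafka message. The function name is made up; this is not code from the OTEL Kafka exporter:)

import "go.opentelemetry.io/collector/pdata/ptrace"

// splitToSingleSpanTraces copies each span into its own ptrace.Traces value,
// preserving its resource and scope, so an exporter could emit one message per span.
func splitToSingleSpanTraces(td ptrace.Traces) []ptrace.Traces {
	var out []ptrace.Traces
	rss := td.ResourceSpans()
	for i := 0; i < rss.Len(); i++ {
		rs := rss.At(i)
		sss := rs.ScopeSpans()
		for j := 0; j < sss.Len(); j++ {
			ss := sss.At(j)
			spans := ss.Spans()
			for k := 0; k < spans.Len(); k++ {
				single := ptrace.NewTraces()
				outRS := single.ResourceSpans().AppendEmpty()
				rs.Resource().CopyTo(outRS.Resource())
				outSS := outRS.ScopeSpans().AppendEmpty()
				ss.Scope().CopyTo(outSS.Scope())
				spans.At(k).CopyTo(outSS.Spans().AppendEmpty())
				out = append(out, single)
			}
		}
	}
	return out
}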

Member

Technically, we could also implement batch handling in the ingester. There are two ways of doing that:

  1. Change the unmarshaler signature a bit to allow returning arrays of spans and opportunistically try to save them one at a time (see the sketch after this list). If one of them fails, the batch will be partially saved, and if a retry happens the whole batch will be re-saved. If this happens rarely enough, it might be an acceptable solution. We would need to verify how our metrics work, so that they don't count the whole batch as +1.
  2. [much larger change] Upgrade our Storage API to allow batches of spans. Just to be clear, I don't think it's worth doing for this ticket, but some storage backends do support batch inserts. E.g. for Elasticsearch there is no particular sharding scheme in place, so it can save a batch of random spans on a single node (with a risk of hot partitions, though, and OOMs if batches are very large). For Cassandra, sending a batch of spans with different trace IDs means the receiving node will become a coordinator, reshard them as needed, and communicate with the other Cassandra nodes. We tried to avoid this mode in our current implementation by making sure the db client does the sharding upfront, to minimize extra hops. I don't remember how either backend would handle atomicity in case of a batch save.
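
(A minimal sketch of option 1, assuming Jaeger's model and spanstore packages; the interface and helper names are illustrative, not an agreed design:)

import (
	"context"

	"github.com/jaegertracing/jaeger/model"
	"github.com/jaegertracing/jaeger/storage/spanstore"
)

// Hypothetical variant of the unmarshaler contract: one Kafka message may
// yield multiple spans.
type SpanBatchUnmarshaler interface {
	UnmarshalSpans(msg []byte) ([]*model.Span, error)
}

// saveAll writes spans one at a time, so a mid-batch failure leaves the batch
// partially saved and a retry re-writes the earlier spans (the trade-off in point 1).
func saveAll(ctx context.Context, writer spanstore.Writer, spans []*model.Span) error {
	for _, span := range spans {
		if err := writer.WriteSpan(ctx, span); err != nil {
			return err
		}
	}
	return nil
}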


I am trying the first of your suggestions.

change the unmarshaler signature a bit to allow returning arrays of spans

But I don't have permission to commit to this branch. Is it possible to get it?

Or I can create a new PR.

Member

A new PR is fine. You can cherry-pick the commits from the original PR to give credit to the original author.

@yurishkuro
Member

Closing since there's no follow-up from the author, and someone else picked it up in #4333.

@yurishkuro closed this on Mar 25, 2023